Active learning by label uncertainty for acoustic emotion recognition
Authors
Abstract
Speech data is, in principle, available in large amounts for training acoustic emotion recognisers. However, emotional labels are usually not provided, and the class distribution is heavily unbalanced, as most data is ‘rather neutral’ rather than truly ‘emotional’. In the ‘haystack’ of speech data, Active Learning automatically identifies the ‘needles’, i.e., the more informative instances, to reduce human labelling effort when building a classifier, e.g., for acoustic emotion recognition. The critical issue is thus how to determine and quantify informativeness. To this end, we suggest exploiting the reliability of emotional labels, which are usually ambiguous, i.e., we propose a novel approach based on label uncertainty. By building a certainty model and using it to predict the candidate instances, we base informativeness on labeller agreement. In addition, we consider class sparseness. The results of extensive test runs under well-standardised conditions show the method’s great potential for reducing labelling costs while boosting performance.
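The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the certainty model is built or which end of the certainty range is queried. The sketch assumes that agreement is the share of labellers voting for the majority label, that a 1-nearest-neighbour lookup stands in for the certainty model, and that the instances with the lowest predicted agreement are sent for labelling. All function names are hypothetical, and class sparseness is not modelled.

```python
from collections import Counter


def agreement(votes):
    """Fraction of labellers who chose the majority label for one instance."""
    counts = Counter(votes)
    return counts.most_common(1)[0][1] / len(votes)


def predict_certainty(train_feats, train_agreement, x):
    """Stand-in certainty model (assumption): copy the agreement score of the
    nearest labelled instance, using squared Euclidean distance."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = min(range(len(train_feats)), key=lambda i: dist(train_feats[i], x))
    return train_agreement[nearest]


def select_for_labelling(train_feats, train_votes, pool_feats, k):
    """Return indices of the k pool instances with the lowest predicted agreement.

    Querying the low-agreement end is an assumption of this sketch.
    """
    train_agreement = [agreement(v) for v in train_votes]
    scores = [predict_certainty(train_feats, train_agreement, x) for x in pool_feats]
    ranked = sorted(range(len(pool_feats)), key=lambda i: scores[i])
    return ranked[:k]


# Toy data: two labelled instances with four labeller votes each,
# and three unlabelled candidates described by two acoustic features.
train_feats = [[0.0, 0.0], [1.0, 1.0]]
train_votes = [
    ["neutral", "neutral", "neutral", "neutral"],
    ["anger", "neutral", "anger", "joy"],
]
pool_feats = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]]

selected = select_for_labelling(train_feats, train_votes, pool_feats, k=1)
```

In this toy run, the candidate closest to the labelled instance on which the labellers disagreed is the one selected. A real system would replace the nearest-neighbour lookup with a trained regressor or classifier and would retrain it after each labelling round.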
Similar resources
High-level feature representation using recurrent neural network for speech emotion recognition
This paper presents a speech emotion recognition system using a recurrent neural network (RNN) model trained by an efficient learning algorithm. The proposed system takes into account the long-range contextual effect and the uncertainty of emotional label expressions. To extract high-level representation of emotional states with regard to its temporal dynamics, a powerful learning method with a...
Active Learning by Sparse Instance Tracking and Classifier Confidence in Acoustic Emotion Recognition
Data scarcity is an ever crucial problem in the field of acoustic emotion recognition. How to get the most informative data from a huge amount of data by least human work and at the same time to obtain the highest performance is quite important. In this paper, we propose and investigate two active learning strategies in acoustic emotion recognition: Based on sparse instances or based on classif...
Reusing Neural Speech Representations for Auditory Emotion Recognition
Acoustic emotion recognition aims to categorize the affective state of the speaker and is still a difficult task for machine learning models. The difficulties come from the scarcity of training data, general subjectivity in emotion perception resulting in low annotator agreement, and the uncertainty about which features are the most relevant and robust ones for classification. In this paper, we...
DBN-ivector Framework for Acoustic Emotion Recognition
Deep learning and i-vectors have been successfully used in speech and speaker recognition recently. In this work we propose a framework based on deep belief network (DBN) and ivector space modeling for acoustic emotion recognition. We use two types of labels for frame level DBN training. The first one is the vector of posterior probabilities calculated from the GMM universal background model (U...
Active learning for dimensional speech emotion recognition
State-of-the-art dimensional speech emotion recognition systems are trained using continuously labelled instances. The data labelling process is labour intensive and time-consuming. In this paper, we propose to apply active learning to reduce according efforts: The unlabelled instances are evaluated automatically, and only the most informative ones are intelligently picked by an informativeness...
Journal title:
Volume / issue
Pages -
Publication date 2013